A Novel Neighborhood‐Weighted Sampling Method for Imbalanced Datasets

Authors

Abstract

Weighted sampling methods based on k-nearest neighbors have been shown to be effective in addressing the class imbalance problem. However, they usually ignore the positional relationship between a sample and the heterogeneous samples in its neighborhood when calculating weights. This paper proposes a novel neighborhood-weighted method, named NWBBagging, to improve the performance of the Bagging algorithm on imbalanced datasets. It takes the neighborhood center into account when identifying critical samples. A parameter reduction method is also proposed and combined into the ensemble learning framework, which reduces the number of parameters and increases the diversity of the classifiers. We compare NWBBagging with several state-of-the-art algorithms on 34 datasets, and the results show that it achieves better performance.
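The core idea described above can be illustrated with a minimal sketch of k-NN-based neighborhood weighting. This is an assumption-laden illustration, not the paper's NWBBagging: it only weights each minority sample by the fraction of majority-class points among its k nearest neighbors (the paper additionally uses the neighborhood center and embeds the weights in a Bagging ensemble). The function name `neighborhood_weights` is hypothetical.

```python
import numpy as np

def neighborhood_weights(X, y, minority_label=1, k=5):
    """Illustrative sketch: weight each minority sample by the fraction of
    majority-class samples among its k nearest neighbors, so samples near
    the class border receive higher sampling weight. Not the paper's
    exact NWBBagging procedure."""
    minority_idx = np.where(y == minority_label)[0]
    weights = np.zeros(len(minority_idx))
    for j, i in enumerate(minority_idx):
        d = np.linalg.norm(X - X[i], axis=1)  # Euclidean distances to all points
        d[i] = np.inf                         # exclude the sample itself
        nn = np.argsort(d)[:k]                # indices of the k nearest neighbors
        weights[j] = np.mean(y[nn] != minority_label)  # majority fraction
    if weights.sum() > 0:
        weights = weights / weights.sum()     # normalize to a sampling distribution
    else:
        weights = np.full_like(weights, 1.0 / len(weights))
    return minority_idx, weights
```

The returned distribution can then drive weighted resampling of the minority class, so that borderline samples (those surrounded by more heterogeneous neighbors) are replicated more often.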


Similar articles

Margin-Based Over-Sampling Method for Learning from Imbalanced Datasets

Learning from imbalanced datasets has drawn more and more attention from both theoretical and practical perspectives. Over-sampling is a popular and simple method for imbalanced learning. In this paper, we show that there is an inherent potential risk associated with over-sampling algorithms in terms of the large margin principle. We then propose a new synthetic over-sampling method, named Mar...


Imbalanced Datasets: from Sampling to Classifiers

Classification is one of the most fundamental tasks in the machine learning and data-mining communities. One of the most common challenges faced when trying to perform classification is the class imbalance problem. A dataset is considered imbalanced if the class of interest (positive or minority class) is relatively rare as compared to the other classes (negative or majority classes). As a resu...
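The definition above (a dataset is imbalanced when the minority class is rare relative to the majority classes) is commonly quantified by an imbalance ratio. A minimal sketch, using a hypothetical helper name:

```python
from collections import Counter

def imbalance_ratio(labels):
    """Ratio of the largest class size to the smallest; higher values
    indicate more severe class imbalance (1.0 means perfectly balanced)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())
```

For example, a dataset with 90 negative and 10 positive instances has an imbalance ratio of 9.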


A Cost-Sensitive Ensemble Method for Class-Imbalanced Datasets

By assigning different misclassification costs for the positive and negative classes, SVM can be extended to the cost-sensitive setting by introducing an additional parameter that penalizes the errors asymmetrically. Consider a binary classification problem represented by a data set {(x_1, y_1), (x_2, y_2), ..., (x_l, y_l)}, where x_i ∈ R^k represents a k-dimensional data point and ...
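The asymmetric penalty described in this snippet can be sketched as a cost-sensitive hinge-loss objective. This is an illustrative formulation under assumed cost parameters (`c_pos`, `c_neg` are hypothetical names), not the exact objective of the cited paper:

```python
import numpy as np

def weighted_hinge_loss(w, b, X, y, c_pos=5.0, c_neg=1.0, reg=1.0):
    """Cost-sensitive linear-SVM objective (sketch): hinge losses on the
    positive (+1) class are scaled by c_pos and on the negative (-1)
    class by c_neg, plus an L2 regularization term on the weights."""
    margins = y * (X @ w + b)                 # y_i (w^T x_i + b)
    hinge = np.maximum(0.0, 1.0 - margins)    # standard hinge loss per sample
    costs = np.where(y == 1, c_pos, c_neg)    # asymmetric per-class penalties
    return 0.5 * reg * np.dot(w, w) + np.sum(costs * hinge)
```

Setting c_pos larger than c_neg makes errors on the (typically minority) positive class more expensive, shifting the decision boundary toward the majority class.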


A Novel One Sided Feature Selection Method for Imbalanced Text Classification

Imbalanced data arises in various areas such as text classification, credit card fraud detection, risk management, web page classification, image classification, medical diagnosis/monitoring, and biological data analysis. Classification algorithms tend to favor the large class and may even treat minority-class data as outliers. Text data is one of t...


Handling imbalanced datasets: A review

Learning classifiers from imbalanced or skewed datasets is an important topic, arising very often in practical classification problems. In such problems, almost all the instances are labelled as one class, while far fewer instances are labelled as the other, usually more important, class. It is obvious that traditional classifiers seeking accurate performance over a full range of ...



Journal

Journal title: Chinese Journal of Electronics

Year: 2022

ISSN: 1022-4653, 2075-5597

DOI: https://doi.org/10.1049/cje.2021.00.121